Overview

Dataset statistics

Number of variables: 12
Number of observations: 826
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 71.1 KiB
Average record size in memory: 88.2 B

Variable types

Numeric: 10
Categorical: 2

Alerts

Urea is highly correlated with Cr (High correlation)
Cr is highly correlated with Urea (High correlation)
HbA1c is highly correlated with AGE and 2 other fields (High correlation)
TG is highly correlated with VLDL (High correlation)
VLDL is highly correlated with TG and 1 other field (High correlation)
BMI is highly correlated with AGE and 3 other fields (High correlation)
CLASS is highly correlated with AGE and 2 other fields (High correlation)
AGE is highly correlated with HbA1c and 2 other fields (High correlation)
Chol is highly correlated with LDL (High correlation)
LDL is highly correlated with Chol (High correlation)

Reproduction

Analysis started: 2022-10-11 05:59:28.051922
Analysis finished: 2022-10-11 06:00:49.930697
Duration: 1 minute and 21.88 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json

Variables

AGE
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 50
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.49031477
Minimum: 20
Maximum: 79
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 20
5-th percentile: 33.25
Q1: 51
Median: 55
Q3: 59
95-th percentile: 65.75
Maximum: 79
Range: 59
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 8.808427268
Coefficient of variation (CV): 0.1646733115
Kurtosis: 1.446224746
Mean: 53.49031477
Median Absolute Deviation (MAD): 4
Skewness: -0.811071565
Sum: 44183
Variance: 77.58839093
Monotonicity: Not monotonic
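
Several of the AGE figures above are simple functions of one another, which gives a quick way to sanity-check a report. The sketch below is illustrative only: it copies the reported sum, count, standard deviation, quartiles and extremes, and recomputes the mean, CV, variance, IQR and range from them. The variable names are arbitrary choices, not part of the report.

```python
# Figures copied from the AGE block of the report.
n = 826                 # number of observations
total = 44183           # Sum
std = 8.808427268       # Standard deviation
q1, q3 = 51, 59         # Q1 and Q3
minimum, maximum = 20, 79

mean = total / n                 # mean = sum / count
cv = std / mean                  # coefficient of variation = std / mean
variance = std ** 2              # variance = std squared
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"mean={mean:.8f} cv={cv:.10f} variance={variance:.6f}")
print(f"IQR={iqr} range={value_range}")
```

The recomputed values agree with the reported Mean, CV, Variance, IQR and Range to within rounding.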
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
55                  145    17.6%
60                  76     9.2%
54                  70     8.5%
51                  42     5.1%
61                  42     5.1%
56                  37     4.5%
52                  36     4.4%
50                  34     4.1%
59                  27     3.3%
57                  25     3.0%
Other values (40)   292    35.4%

Minimum 10 values

Value   Count  Frequency (%)
20      1      0.1%
25      1      0.1%
26      2      0.2%
28      3      0.4%
30      15     1.8%
31      7      0.8%
32      1      0.1%
33      12     1.5%
34      4      0.5%
35      10     1.2%

Maximum 10 values

Value   Count  Frequency (%)
79      1      0.1%
77      3      0.4%
76      4      0.5%
75      2      0.2%
73      5      0.6%
71      1      0.1%
70      2      0.2%
69      5      0.6%
68      7      0.8%
67      5      0.6%

Urea
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 110
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.184676755
Minimum: 0.5
Maximum: 38.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0.5
5-th percentile: 2.3
Q1: 3.615
Median: 4.6
Q3: 5.7
95-th percentile: 10
Maximum: 38.9
Range: 38.4
Interquartile range (IQR): 2.085

Descriptive statistics

Standard deviation: 3.077831319
Coefficient of variation (CV): 0.5936399633
Kurtosis: 29.19738507
Mean: 5.184676755
Median Absolute Deviation (MAD): 1.08
Skewness: 4.258575864
Sum: 4282.543
Variance: 9.473045628
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
5                   30     3.6%
4.3                 27     3.3%
4                   27     3.3%
4.1                 26     3.1%
3                   23     2.8%
4.8                 23     2.8%
4.7                 22     2.7%
3.8                 21     2.5%
4.5                 21     2.5%
4.6                 21     2.5%
Other values (100)  585    70.8%

Minimum 10 values

Value   Count  Frequency (%)
0.5     1      0.1%
1.1     1      0.1%
1.2     1      0.1%
1.8     2      0.2%
1.9     1      0.1%
2       16     1.9%
2.1     13     1.6%
2.2     5      0.6%
2.3     7      0.8%
2.4     4      0.5%

Maximum 10 values

Value   Count  Frequency (%)
38.9    1      0.1%
26.4    1      0.1%
24      3      0.4%
22      2      0.2%
20.8    3      0.4%
20      1      0.1%
17.1    1      0.1%
14.9    1      0.1%
14.5    2      0.2%
14.1    1      0.1%

Cr
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 113
Distinct (%): 13.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.02421308
Minimum: 6
Maximum: 800
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 6
5-th percentile: 33
Q1: 48
Median: 59
Q3: 73
95-th percentile: 112.75
Maximum: 800
Range: 794
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 59.55710797
Coefficient of variation (CV): 0.8628437083
Kurtosis: 87.03510647
Mean: 69.02421308
Median Absolute Deviation (MAD): 13
Skewness: 8.148519109
Sum: 57014
Variance: 3547.04911
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
56                  36     4.4%
55                  27     3.3%
70                  23     2.8%
72                  22     2.7%
53                  22     2.7%
62                  21     2.5%
48                  21     2.5%
59                  20     2.4%
61                  19     2.3%
49                  19     2.3%
Other values (103)  596    72.2%

Minimum 10 values

Value   Count  Frequency (%)
6       1      0.1%
20      2      0.2%
22      2      0.2%
23      3      0.4%
24      2      0.2%
25      1      0.1%
26      1      0.1%
27      2      0.2%
28      4      0.5%
29      1      0.1%

Maximum 10 values

Value   Count  Frequency (%)
800     3      0.4%
401     3      0.4%
370     2      0.2%
344     1      0.1%
327     1      0.1%
315     1      0.1%
243     1      0.1%
230     1      0.1%
228     1      0.1%
203     1      0.1%

HbA1c
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 111
Distinct (%): 13.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.326343826
Minimum: 0.9
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0.9
5-th percentile: 4.1
Q1: 6.5
Median: 8.1
Q3: 10.2
95-th percentile: 12.575
Maximum: 16
Range: 15.1
Interquartile range (IQR): 3.7

Descriptive statistics

Standard deviation: 2.602589004
Coefficient of variation (CV): 0.3125728482
Kurtosis: -0.3397004534
Mean: 8.326343826
Median Absolute Deviation (MAD): 1.9
Skewness: 0.1825979019
Sum: 6877.56
Variance: 6.773469525
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
4                   30     3.6%
8                   29     3.5%
9                   27     3.3%
7                   23     2.8%
6.8                 22     2.7%
5                   20     2.4%
10.2                19     2.3%
6                   17     2.1%
8.5                 16     1.9%
7.7                 16     1.9%
Other values (101)  607    73.5%

Minimum 10 values

Value   Count  Frequency (%)
0.9     3      0.4%
2       1      0.1%
3       1      0.1%
3.7     4      0.5%
4       30     3.6%
4.1     7      0.8%
4.2     5      0.6%
4.3     9      1.1%
4.5     5      0.6%
4.6     1      0.1%

Maximum 10 values

Value   Count  Frequency (%)
16      1      0.1%
15.9    1      0.1%
15      2      0.2%
14.8    1      0.1%
14.7    3      0.4%
14.6    2      0.2%
14.5    2      0.2%
14.4    1      0.1%
14.1    1      0.1%
13.9    2      0.2%

Chol
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 77
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.898208232
Minimum: 0
Maximum: 10.3
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 4
Median: 4.8
Q3: 5.6
95-th percentile: 7.2
Maximum: 10.3
Range: 10.3
Interquartile range (IQR): 1.6

Descriptive statistics

Standard deviation: 1.328811612
Coefficient of variation (CV): 0.2712852433
Kurtosis: 1.69611274
Mean: 4.898208232
Median Absolute Deviation (MAD): 0.8
Skewness: 0.575506994
Sum: 4045.92
Variance: 1.765740301
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
4.4                 34     4.1%
4.9                 34     4.1%
4.8                 32     3.9%
5.3                 32     3.9%
4                   30     3.6%
4.2                 30     3.6%
5.2                 29     3.5%
4.1                 28     3.4%
4.6                 25     3.0%
3.6                 24     2.9%
Other values (67)   528    63.9%

Minimum 10 values

Value   Count  Frequency (%)
0       1      0.1%
0.5     1      0.1%
0.6     1      0.1%
1.2     1      0.1%
2       5      0.6%
2.1     2      0.2%
2.3     3      0.4%
2.4     4      0.5%
2.5     4      0.5%
2.6     4      0.5%

Maximum 10 values

Value   Count  Frequency (%)
10.3    1      0.1%
9.9     1      0.1%
9.8     2      0.2%
9.7     2      0.2%
9.5     2      0.2%
9.3     1      0.1%
9.2     1      0.1%
9.1     1      0.1%
8.8     2      0.2%
8.6     1      0.1%

TG
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 69
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.39937046
Minimum: 0.3
Maximum: 13.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0.3
5-th percentile: 0.8
Q1: 1.5
Median: 2.015
Q3: 3
95-th percentile: 5.1
Maximum: 13.8
Range: 13.5
Interquartile range (IQR): 1.5

Descriptive statistics

Standard deviation: 1.456849516
Coefficient of variation (CV): 0.6071799
Kurtosis: 10.14366515
Mean: 2.39937046
Median Absolute Deviation (MAD): 0.715
Skewness: 2.294614502
Sum: 1981.88
Variance: 2.122410512
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
2.1                 46     5.6%
2                   45     5.4%
1.3                 39     4.7%
1.5                 36     4.4%
1.7                 35     4.2%
1.9                 33     4.0%
2.2                 31     3.8%
1.8                 30     3.6%
1.6                 30     3.6%
1.2                 30     3.6%
Other values (59)   471    57.0%

Minimum 10 values

Value   Count  Frequency (%)
0.3     2      0.2%
0.5     1      0.1%
0.6     12     1.5%
0.7     22     2.7%
0.8     16     1.9%
0.9     12     1.5%
1       22     2.7%
1.1     25     3.0%
1.19    2      0.2%
1.2     30     3.6%

Maximum 10 values

Value   Count  Frequency (%)
13.8    1      0.1%
12.7    1      0.1%
11.6    1      0.1%
8.7     1      0.1%
8.5     1      0.1%
7.7     2      0.2%
7.2     2      0.2%
7       3      0.4%
6.8     2      0.2%
6.7     2      0.2%

HDL
Real number (ℝ≥0)

Distinct: 48
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.211803874
Minimum: 0.2
Maximum: 9.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0.2
5-th percentile: 0.7
Q1: 0.9
Median: 1.1
Q3: 1.3
95-th percentile: 1.9
Maximum: 9.9
Range: 9.7
Interquartile range (IQR): 0.4

Descriptive statistics

Standard deviation: 0.6796098812
Coefficient of variation (CV): 0.5608249781
Kurtosis: 62.70008671
Mean: 1.211803874
Median Absolute Deviation (MAD): 0.2
Skewness: 6.29831597
Sum: 1000.95
Variance: 0.4618695906
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=48)

Common values

Value               Count  Frequency (%)
0.9                 120    14.5%
1.1                 113    13.7%
1                   105    12.7%
0.8                 70     8.5%
1.3                 65     7.9%
1.2                 60     7.3%
0.7                 47     5.7%
1.4                 46     5.6%
1.6                 35     4.2%
1.8                 28     3.4%
Other values (38)   137    16.6%

Minimum 10 values

Value   Count  Frequency (%)
0.2     1      0.1%
0.4     7      0.8%
0.5     6      0.7%
0.6     20     2.4%
0.7     47     5.7%
0.75    2      0.2%
0.8     70     8.5%
0.9     120    14.5%
0.95    1      0.1%
1       105    12.7%

Maximum 10 values

Value   Count  Frequency (%)
9.9     1      0.1%
9       1      0.1%
6.6     1      0.1%
6.3     1      0.1%
5       1      0.1%
4       1      0.1%
3.9     1      0.1%
3.8     1      0.1%
3.6     2      0.2%
3.4     1      0.1%

LDL
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 65
Distinct (%): 7.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.590060533
Minimum: 0.3
Maximum: 9.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0.3
5-th percentile: 1
Q1: 1.7
Median: 2.5
Q3: 3.3
95-th percentile: 4.3
Maximum: 9.9
Range: 9.6
Interquartile range (IQR): 1.6

Descriptive statistics

Standard deviation: 1.132863199
Coefficient of variation (CV): 0.4373886958
Kurtosis: 3.023358049
Mean: 2.590060533
Median Absolute Deviation (MAD): 0.8
Skewness: 1.005128643
Sum: 2139.39
Variance: 1.283379027
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
2.5                 49     5.9%
2                   43     5.2%
1.7                 35     4.2%
3                   33     4.0%
1.4                 32     3.9%
3.6                 32     3.9%
2.6                 31     3.8%
3.1                 28     3.4%
1.3                 27     3.3%
2.1                 25     3.0%
Other values (55)   491    59.4%

Minimum 10 values

Value   Count  Frequency (%)
0.3     1      0.1%
0.5     2      0.2%
0.6     3      0.4%
0.7     2      0.2%
0.75    3      0.4%
0.8     6      0.7%
0.9     19     2.3%
0.95    4      0.5%
0.96    1      0.1%
1       9      1.1%

Maximum 10 values

Value   Count  Frequency (%)
9.9     1      0.1%
7.9     2      0.2%
7.5     1      0.1%
7       1      0.1%
6.4     1      0.1%
5.9     1      0.1%
5.6     3      0.4%
5.5     3      0.4%
5.3     1      0.1%
5.1     1      0.1%

VLDL
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 60
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.774576271
Minimum: 0.1
Maximum: 35
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 0.4
Q1: 0.7
Median: 1
Q3: 1.5
95-th percentile: 4.775
Maximum: 35
Range: 34.9
Interquartile range (IQR): 0.8

Descriptive statistics

Standard deviation: 3.517930883
Coefficient of variation (CV): 1.982406133
Kurtosis: 40.18481999
Mean: 1.774576271
Median Absolute Deviation (MAD): 0.4
Skewness: 5.89318668
Sum: 1465.8
Variance: 12.3758377
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0.9                 86     10.4%
0.7                 74     9.0%
1                   65     7.9%
0.8                 65     7.9%
0.6                 60     7.3%
1.5                 59     7.1%
1.1                 49     5.9%
0.5                 43     5.2%
1.3                 37     4.5%
2                   35     4.2%
Other values (50)   253    30.6%

Minimum 10 values

Value   Count  Frequency (%)
0.1     2      0.2%
0.2     5      0.6%
0.3     23     2.8%
0.4     32     3.9%
0.5     43     5.2%
0.6     60     7.3%
0.7     74     9.0%
0.8     65     7.9%
0.9     86     10.4%
1       65     7.9%

Maximum 10 values

Value   Count  Frequency (%)
35      1      0.1%
33.6    1      0.1%
31.8    1      0.1%
31      1      0.1%
27.2    1      0.1%
24.5    1      0.1%
22.7    1      0.1%
22.2    1      0.1%
19.5    1      0.1%
18.1    1      0.1%

BMI
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 64
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.45927361
Minimum: 19
Maximum: 47.75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.6 KiB

Quantile statistics

Minimum: 19
5-th percentile: 21
Q1: 26
Median: 30
Q3: 33
95-th percentile: 38
Maximum: 47.75
Range: 28.75
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.996675763
Coefficient of variation (CV): 0.1696129996
Kurtosis: -0.425355565
Mean: 29.45927361
Median Absolute Deviation (MAD): 3
Skewness: 0.08120902248
Sum: 24333.36
Variance: 24.96676868
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
30                  91     11.0%
33                  85     10.3%
29                  57     6.9%
24                  52     6.3%
26                  50     6.1%
31                  46     5.6%
28                  42     5.1%
27                  41     5.0%
32                  40     4.8%
21                  35     4.2%
Other values (54)   287    34.7%

Minimum 10 values

Value   Count  Frequency (%)
19      6      0.7%
19.5    1      0.1%
20      10     1.2%
21      35     4.2%
21.17   1      0.1%
22      29     3.5%
22.5    1      0.1%
23      32     3.9%
23.5    1      0.1%
24      52     6.3%

Maximum 10 values

Value   Count  Frequency (%)
47.75   1      0.1%
47      1      0.1%
43.25   1      0.1%
40.5    1      0.1%
40      1      0.1%
39.18   1      0.1%
39      18     2.2%
38.62   1      0.1%
38      21     2.5%
37.62   1      0.1%

Gender
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB
Category counts: 1 (463), 0 (363)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 826
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
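
As a small illustration of the character properties mentioned above, Python's standard `unicodedata` module reports the general category of a code point. This sketch assumes the two Gender values are the characters `0` and `1`, as the counts in this section suggest. The standard library does not expose script or block names, so only the category and a simple ASCII range check are shown.

```python
import unicodedata

# The two distinct characters observed in the Gender column (assumed).
chars = "01"

# General category "Nd" is what the report labels "Decimal Number".
categories = {ch: unicodedata.category(ch) for ch in chars}

# Code points below 128 fall in the ASCII (Basic Latin) block.
in_ascii = {ch: ord(ch) < 128 for ch in chars}

for ch in chars:
    print(ch, unicodedata.name(ch), categories[ch], in_ascii[ch])
```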

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value   Count  Frequency (%)
1       463    56.1%
0       363    43.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count  Frequency (%)
1       463    56.1%
0       363    43.9%

Most occurring characters

Value   Count  Frequency (%)
1       463    56.1%
0       363    43.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  826    100.0%

Most frequent character per category

Decimal Number
Value   Count  Frequency (%)
1       463    56.1%
0       363    43.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  826    100.0%

Most frequent character per script

Common
Value   Count  Frequency (%)
1       463    56.1%
0       363    43.9%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   826    100.0%

Most frequent character per block

ASCII
Value   Count  Frequency (%)
1       463    56.1%
0       363    43.9%

CLASS
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB
Category counts: 2 (690), 0 (96), 1 (40)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 826
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count  Frequency (%)
2       690    83.5%
0       96     11.6%
1       40     4.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value   Count  Frequency (%)
2       690    83.5%
0       96     11.6%
1       40     4.8%

Most occurring characters

Value   Count  Frequency (%)
2       690    83.5%
0       96     11.6%
1       40     4.8%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  826    100.0%

Most frequent character per category

Decimal Number
Value   Count  Frequency (%)
2       690    83.5%
0       96     11.6%
1       40     4.8%

Most occurring scripts

Value   Count  Frequency (%)
Common  826    100.0%

Most frequent character per script

Common
Value   Count  Frequency (%)
2       690    83.5%
0       96     11.6%
1       40     4.8%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   826    100.0%

Most frequent character per block

ASCII
Value   Count  Frequency (%)
2       690    83.5%
0       96     11.6%
1       40     4.8%

Interactions

[Interaction plots omitted: 100 figures, one for each pairing of the 10 numeric variables.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
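
The calculation just described can be sketched in plain Python: rank both variables (ties share the average of their positions), then apply the covariance-over-standard-deviations formula to the ranks. This is an illustrative sketch rather than the code the report ran; the function names and the toy data are assumptions made for the example.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # monotonic but nonlinear (toy data)
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this toy input ρ is 1 up to floating-point error, while Pearson's r stays below 1, which is the behaviour the paragraph above describes.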

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
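
A minimal sketch of this formula, with a check of the location and scale invariance mentioned above. The ten Urea and Cr values are the ones read from the First rows sample of this report. The function name and the particular shift and scale constants are arbitrary choices for illustration.

```python
def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/(n-1) factors of covariance and standard deviations cancel.
    return cov / (sx * sy)


# First ten sample rows of the report.
urea = [4.7, 4.5, 7.1, 2.3, 2.0, 4.7, 2.6, 3.6, 4.4, 3.3]
cr = [46, 62, 46, 24, 50, 47, 67, 28, 55, 53]

r = pearson_r(urea, cr)

# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * u + 3 for u in urea], [0.5 * c - 7 for c in cr])

print(round(r, 6), round(r_rescaled, 6))
```

Both printed values coincide. Ten rows are far too few to say anything about the full dataset, so the number itself should not be compared with the report's correlation matrix.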

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
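
The pair-counting definition above corresponds to the simplest variant, often called tau-a, and can be sketched directly. Note the assumption: statistical libraries commonly compute a tie-corrected variant (tau-b) instead, so values from this sketch may differ from the report's whenever the data contain ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no correction for ties."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Toy data: one swapped pair out of six gives 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```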

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
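
A sketch of a bias-corrected Cramér's V in plain Python, following the correction usually attributed to Bergsma (2013): the chi-squared statistic is divided by n, a bias term is subtracted, and the numbers of rows and columns are adjusted before normalising. It is an illustration under those assumptions, not the report's own implementation, and the function name and toy inputs are invented for the example.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))        # observed cell counts
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row, row_count in row_totals.items():
        for col, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


labels = [0, 1] * 50
v_perfect = cramers_v_corrected(labels, labels)                    # identical labels
v_independent = cramers_v_corrected([0, 0, 1, 1] * 25, [0, 1, 0, 1] * 25)
print(v_perfect, v_independent)
```

Identical label sequences give a value of 1 (up to rounding), and the balanced, independent toy table gives 0.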

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  AGE  Urea  Cr   HbA1c  Chol  TG   HDL  LDL  VLDL  BMI   Gender  CLASS
0      50   4.7   46   4.9    4.2   0.9  2.4  1.4  0.5   24.0  0       0
1      26   4.5   62   4.9    3.7   1.4  1.1  2.1  0.6   23.0  1       0
2      33   7.1   46   4.9    4.9   1.0  0.8  2.0  0.4   21.0  1       0
3      45   2.3   24   4.0    2.9   1.0  1.0  1.5  0.4   21.0  0       0
4      50   2.0   50   4.0    3.6   1.3  0.9  2.1  0.6   24.0  0       0
5      48   4.7   47   4.0    2.9   0.8  0.9  1.6  0.4   24.0  1       0
6      43   2.6   67   4.0    3.8   0.9  2.4  3.7  1.0   21.0  1       0
7      32   3.6   28   4.0    3.8   2.0  2.4  3.8  1.0   24.0  0       0
8      31   4.4   55   4.2    3.6   0.7  1.7  1.6  0.3   23.0  0       0
9      33   3.3   53   4.0    4.0   1.1  0.9  2.7  1.0   21.0  0       0

Last rows

Index  AGE  Urea  Cr   HbA1c  Chol  TG   HDL  LDL  VLDL  BMI   Gender  CLASS
816    75   10.3  113  8.6    4.2   1.6  0.9  2.6  0.7   32.0  0       2
817    58   4.0   55   7.9    4.9   2.0  1.2  1.4  1.1   35.0  1       2
818    55   5.4   62   6.8    5.3   2.0  1.0  3.5  0.9   30.1  1       2
819    55   4.8   88   11.1   5.7   4.0  0.9  3.3  1.8   30.0  1       2
820    62   6.3   82   6.7    5.3   2.0  1.0  3.5  0.9   30.1  1       2
821    57   4.1   70   9.3    5.3   3.3  1.0  1.4  1.3   29.0  0       2
822    55   3.1   39   8.5    5.0   2.5  1.9  2.9  0.7   27.0  1       2
823    28   3.5   61   8.5    4.5   1.9  1.1  2.6  0.8   37.0  1       2
824    69   10.3  185  7.7    4.9   1.9  1.2  3.0  0.7   37.0  1       2
825    71   11.0  97   7.0    7.5   1.7  1.2  1.8  0.6   30.0  1       2